Spectro-temporal features with distribution equalization

Authors

  • Samuel K. Ngouoko M
  • Martin Heckmann
  • Britta Wrede
Abstract

We have previously shown that Hierarchical Spectro-Temporal (HIST) features improve the performance of Automatic Recognition Systems (ARS) for speech in difficult environments when they are combined with conventional spectral speech features. Here we aim to improve the noise robustness of the HIST features by investigating a per-channel distribution equalization within our feature hierarchy. To this end, we determine the empirical cumulative distribution of the speech training data set, which serves as the reference distribution. The distributions of both the training and the test data are then adjusted to match this reference distribution. We apply this distribution equalization in the preprocessing step as well as after each feature extraction step of our HIST feature extraction framework, and we evaluate its benefits under different noise types.
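
The equalization step described in the abstract can be illustrated with a minimal sketch. The abstract does not say how the mapping onto the reference distribution is computed, so everything below is an assumption made for illustration: the function names (`fit_reference`, `inverse_cdf`, `equalize_channel`, `equalize_features`) are hypothetical, the empirical CDF of the data being equalized is estimated from ranks within the segment itself, and the inverse reference CDF uses linear interpolation between sorted training values.

```python
def fit_reference(train_values):
    """Sorted training values of one channel stand in for the empirical
    reference CDF (assumption: at least one training value is given)."""
    return sorted(float(v) for v in train_values)


def inverse_cdf(reference_sorted, p):
    """Quantile of the reference distribution at probability p in [0, 1],
    using linear interpolation between neighbouring sorted values."""
    pos = p * (len(reference_sorted) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(reference_sorted) - 1)
    frac = pos - lo
    return reference_sorted[lo] * (1.0 - frac) + reference_sorted[hi] * frac


def equalize_channel(values, reference_sorted):
    """Map one channel onto the reference distribution.

    Each value is replaced by the reference quantile at its own empirical
    CDF position, estimated here as (rank + 0.5) / n (an assumed estimator).
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    out = [0.0] * n
    for rank, i in enumerate(order):
        out[i] = inverse_cdf(reference_sorted, (rank + 0.5) / n)
    return out


def equalize_features(features, references):
    """Equalize every channel independently.

    features:   list of channels, each a list of per-frame values
    references: list of sorted reference values, one per channel
    """
    return [equalize_channel(ch, ref) for ch, ref in zip(features, references)]
```

Because the mapping in this sketch depends only on ranks, any strictly increasing distortion of a channel (for example a gain change or an offset) yields the same equalized output. In the setting of the abstract, such a function would be applied to the preprocessed input and again to the output of each feature extraction stage, with a separate reference fitted per channel and per stage.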

Similar resources

Phoneme Classification Using Temporal Tracking of Speech Clusters in Spectro-temporal Domain

This article presents a new feature extraction technique based on the temporal tracking of clusters in the spectro-temporal feature space. In the proposed method, auditory cortical outputs were clustered. The attributes of speech clusters were extracted as secondary features. However, the shape and position of speech clusters change over time. The clusters were temporally tracked and temporal tra...

Normalization of spectro-temporal Gabor filter bank features for improved robust automatic speech recognition systems

Physiologically motivated feature extraction methods based on 2D-Gabor filters have already been used successfully in robust automatic speech recognition (ASR) systems. Recently it was shown that a Mel Frequency Cepstral Coefficients (MFCC) baseline can be improved with physiologically motivated features extracted by a 2D-Gabor filter bank (GBFB). Besides physiologically inspired approaches to ...

Histogram Equalization Based Front-end Processing for Noisy Speech Recognition

In this paper, we present Gabor feature extraction based on front-end processing using histogram equalization for noisy speech recognition. The proposed features, named Histogram Equalization of Gabor Bark Spectrum (HeqGBS) features, are extracted using 2-D Gabor processing followed by a histogram equalization step from the spectro-temporal representation of the Bark spectrum of the speech signal...

Multi-stream spectro-temporal features for robust speech recognition

A multi-stream approach to utilizing the inherently large number of spectro-temporal features for speech recognition is investigated in this study. Instead of reducing the feature-space dimension, this method divides the features into streams so that each represents a patch of information in the spectro-temporal response field. When used in combination with MFCCs for speech recognition under both...

Improved phoneme recognition by integrating evidence from spectro-temporal and cepstral features

Gabor features have been proposed for extracting spectro-temporal modulation information, yielding significant improvements in recognition performance. In this paper, we propose the integration of Gabor posteriors with MFCC posteriors, yielding a relative improvement of 14.3% over an MFCC Tandem system. We analyze for different types of acoustic units the complementarity between Gabor featu...

Publication date: 2012